
Geocities Datasets

Sub-collection of Web Archive Datasets


Geocities Datasets
by Nick Ruest
Web archive derivatives of the GeoCities collection from the Internet Archive. The derivatives were created with the Archives Unleashed Toolkit 1.1.0, and align with the derivatives produced by Archives Research Compute Hub (ARCH).

The CSV derivatives include:

Domain frequency: domain, count
Domain graph: crawl_date, source, target, count
Image graph: crawl_date, source, url, alt_text
Web graph: crawl_date, source, target, anchor_text
Binary information: Audio, pdf, spreadsheet, powerpoint, video, word...
Topics: csv, apache spark, geocities, web archives
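
As a rough illustration of how the domain graph derivative might be used, the sketch below sums link counts per source and target pair across crawl dates. It assumes the CSV carries a header row with the columns listed above (crawl_date, source, target, count). The sample rows and the function name total_links_by_edge are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for a domain graph CSV.
# Assumption: the real file has a header row with these four column names.
SAMPLE_DOMAIN_GRAPH = """crawl_date,source,target,count
20091026,geocities.com,yahoo.com,12
20091026,geocities.com,angelfire.com,3
20091027,geocities.com,yahoo.com,5
"""


def total_links_by_edge(csv_file):
    """Sum the count column for each (source, target) pair across crawl dates."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[(row["source"], row["target"])] += int(row["count"])
    return totals


totals = total_links_by_edge(io.StringIO(SAMPLE_DOMAIN_GRAPH))

# List edges from most to least linked.
for (source, target), count in totals.most_common():
    print(f"{source} -> {target}: {count}")
```

To run this against a downloaded derivative, the io.StringIO object could be swapped for an opened file, for example open(path, newline="", encoding="utf-8"), where path points at the local copy of the domain graph CSV.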